A density-based cluster validity approach using multi-representatives

Authors

  • Maria Halkidi
  • Michalis Vazirgiannis
Abstract

Although the goal of clustering is intuitively compelling and its notion arises in many fields, it is difficult to define a unified approach to address the clustering problem and thus diverse clustering algorithms abound in the research community. These algorithms, under different clustering assumptions, often lead to qualitatively different results. As a consequence the results of clustering algorithms (i.e. data set partitionings) need to be evaluated as regards their validity based on widely accepted criteria. In this paper a cluster validity index, CDbw, is proposed which assesses the compactness and separation of clusters defined by a clustering algorithm. The cluster validity index, given a data set and a set of clustering algorithms, enables: i) the selection of the input parameter values that lead an algorithm to the best possible partitioning of the data set, and ii) the selection of the algorithm that provides the best partitioning of the data set. CDbw handles efficiently arbitrarily shaped clusters by representing each cluster with a number of points rather than by a single representative point. A full implementation and experimental results confirm the reliability of the validity index showing also that its performance compares favourably to that of several others.
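The two uses of a validity index named in the abstract (choosing parameter values and choosing an algorithm) both reduce to scoring candidate partitionings and keeping the best one. The sketch below illustrates only that selection loop. It is not CDbw: `toy_validity` is a hypothetical stand-in that divides centroid separation by within-cluster scatter, so it relies on a single representative per cluster and would not handle arbitrarily shaped clusters the way the abstract says CDbw does. The function names, candidate labels, and sample points are all assumptions made for illustration.

```python
import math


def centroid(points):
    """Mean point of a non-empty list of equal-length coordinate tuples."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dims))


def toy_validity(clusters):
    """Stand-in index (NOT CDbw): separation / compactness, higher is better.

    Assumes at least two non-empty clusters.
    """
    cents = [centroid(c) for c in clusters]
    total = sum(len(c) for c in clusters)
    # Compactness: average distance from each point to its own centroid.
    scatter = sum(
        math.dist(p, cents[i]) for i, c in enumerate(clusters) for p in c
    ) / total
    # Separation: smallest distance between any two cluster centroids.
    separation = min(
        math.dist(a, b) for i, a in enumerate(cents) for b in cents[i + 1:]
    )
    return separation / scatter if scatter > 0 else math.inf


def select_best(candidates):
    """Return the label of the candidate partitioning with the highest score."""
    return max(candidates, key=lambda name: toy_validity(candidates[name]))


# Hypothetical data: two well-separated square blobs.
blob_a = [(0, 0), (0, 1), (1, 0), (1, 1)]
blob_b = [(10, 10), (10, 11), (11, 10), (11, 11)]

# Each candidate stands for the output of one algorithm or parameter setting.
candidates = {
    "split_by_blob": [blob_a, blob_b],
    "split_mixed": [blob_a[:2] + blob_b[:2], blob_a[2:] + blob_b[2:]],
}

best = select_best(candidates)
```

With these sample points the partitioning that follows the blobs scores higher than the mixed one, so `best` holds its label. Substituting a real index such as CDbw would only change the scoring function, not the selection loop.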

Similar resources

Evaluating the validity of clustering results based on density criteria and multi-representatives

Although the goal of clustering is intuitively compelling and its notion arises in many fields, it has been difficult to define a unified approach to address the clustering problem and thus diverse clustering approaches abound in the research community. These approaches are based on different clustering principles and assumptions and they often lead to qualitatively different results. As a cons...

A Multi-Objective Approach to Fuzzy Clustering using ITLBO Algorithm

Data clustering is one of the most important areas of research in data mining and knowledge discovery. Recent research in this area has shown that the best clustering results can be achieved using multi-objective methods. In other words, assuming more than one criterion as objective functions for clustering data can measurably increase the quality of clustering. In this study, a model with two ...

Study of the Feasibility of Regeneration of Central Worn-Out Textures of Ilam City Based on Urban Smart Growth Approach

Deterioration of urban textures is an issue that most cities in Iran are faced with and the organization of such textures is of great importance. One of the safest and most practical solutions for urban planners is to assess the feasibility of regenerating such textures to determine the levels that can be regenerated and are relatively good and tolerable for urban society. Therefore, the presen...

Towards Effective and Efficient Distributed Clustering

Clustering has become an increasingly important task in modern application domains such as marketing and purchasing assistance, multimedia, molecular biology as well as many others. In many of these areas, the data are originally collected at different sites. In order to extract information out of these data, they are brought together and then clustered. In this paper, we propose a different ap...

Incremental Minimax Optimization based Fuzzy Clustering for Large Multi-view Data

Incremental clustering approaches have been proposed for handling large data when given data set is too large to be stored. The key idea of these approaches is to find representatives to represent each cluster in each data chunk and final data analysis is carried out based on those identified representatives from all the chunks. However, most of the incremental approaches are used for single vi...

Journal:
  • Pattern Recognition Letters

Volume 29  Issue 

Pages  -

Publication year 2008